Abstract : An application of principal component analysis to the development of suitable statistical 
models for pre-harvest forecast of wheat yield based on biometrical characters is illustrated in 
the present paper. Data obtained from experiments on wheat under normal and late sowing 
situations have been utilised to develop the model. The results reveal that the proposed model 
can provide reliable pre-harvest forecasts of wheat yield in both situations, with the per cent 
standard error of the forecast ranging from 2.16 to 4.96 per cent. 
Introduction : 

A reliable forecast of crop yield before harvest 
is required by the Government for making policy decisions 
with regard to procurement, distribution, buffer-stocking, 
import-export and marketing of agricultural commodities, 
while agro-based industries, traders and agriculturists 
need it for proper planning of their operations. Various 
research workers have developed pre-harvest forecast 
models for several crops based on time series data on 
crop yield and weekly data on weather variables. Notable 
among them are Agrawal et al. (1980, 1983 and 1986); 
Singh and Bapat (1988); Singh et al. (1986); Yadav et 
al. (2014); Mohd. Azfar et al. (2014); Pandey et al. 
(2014); Mohd. Azfar et al. (2015); Yadav et al. (2015) 
and Annu et al. (2015). Jain et al. (1984, 1985 and 1992b) 
have developed statistical models for forecasting crop 
yield based on biometrical characters using experimental 
and survey data in different regions of the country. Rice 
and wheat are the major cereal crops of Eastern Uttar 
Pradesh. The explanatory variables used in a general 
regression model are often correlated with one another, 
which may create problems in estimating the model parameters. 
Principal component analysis (PCA) of the explanatory variables 
provides principal components (PCs) which are 
uncorrelated. The first few PCs, which explain most of the 
variability in the explanatory variables, are then used in the model 
as explanatory variables, and the model parameters can be 
estimated without difficulty. 

Therefore, an attempt has been made in this paper 
to develop a pre-harvest forecast model for wheat yield 
in Faizabad district of Eastern Uttar Pradesh, using 
experimental data and applying the technique of 
principal component analysis. 

Materials and Methods : 

The materials used and the methodologies employed 
for the development of the forecast model based on biometrical 
characters are described below. 

Study area : 

The present study relates to Faizabad district 
(Eastern Uttar Pradesh, India), which is situated at 
26° 47' N latitude and 82° 12' E longitude. It lies in the 
Eastern plain zone of Uttar Pradesh. It has an annual 
rainfall of about 1002 mm. Nearly 85 per cent of the total 
precipitation is received from the south-west monsoon during 
the months of July to September. However, occasional 
mild showers occur during the winter season. The average 
minimum and maximum temperatures are 18.6°C and 31.3°C, 
respectively. The district is well served by the Saryu 
(Ghaghara) river and its tributaries. Soils are deep alluvial, 
medium to medium heavy textured, but are easily 
ploughable. The favourable climate, soil and the availability 
of ample irrigation facilities make growing of wheat a 
natural choice for the area. The wheat crop is generally 
cultivated during the Rabi season. 

Source and description of data : 

The data on yield of wheat and related biometrical 
characters were obtained from two experiments 
conducted at the Main Experimental Station of Narendra 
Deva University of Agriculture and Technology, 
Kumarganj, Faizabad, U.P., India. The details of the 
experiments are given in Table A. 

The 25 varieties of wheat were the same in both 
experiments. The names of the varieties are: 

1-AKDW4021, 2-NIAW1415, 3-DBW46, 4-USA316, 5-DBW51, 
6-DBW52, 7-HD2864, 8-HD2932, 9-HD2997, 10-HD2985, 
11-HI1563, 12-HI869, 13-HI977, 14-HUW234, 15-HW5207, 
16-MP4010, 17-MP4106, 18-MACS3742, 19-NW4035, 20-PDW317, 
21-PBW590, 22-PBW621, 23-PBW315, 24-RSP561, 25-WHD943. 

The following biometrical characters were measured 
by standard techniques used by plant breeders and 
agronomists. 

1. Plant population/plot ($X_1$), 2. Plant height ($X_2$), 
3. No. of tillers/plot ($X_3$), 4. Length of ear head/plant ($X_4$), 
5. Basal girth ($X_5$), 6. Green leaves/plant ($X_6$), 
7. No. of grains/ear head ($X_7$). 

Development of pre-harvest forecast model using 
principal component analysis : 

Principal component analysis (PCA) is a multivariate 
technique for data reduction. It is a mathematical procedure 
which does not require the user to specify a statistical model 
or to make assumptions about the distribution of the original 
variables. It may also be mentioned that principal components are 
artificial variables and it is often not possible to assign 
physical meaning to them. Further, since principal 
component analysis transforms the original set of correlated 
variables into a new set of uncorrelated variables, it is worth 
stressing that if the original variables are already uncorrelated, 
there is no point in carrying out principal component 
analysis. The theory of principal component analysis is 
available in many standard books on multivariate analysis 
(Anderson, 1974 and Johnson and Wichern, 2001), so its 
theoretical aspects are not presented here. 

Let $x_{ij}$ be the value of the j-th biometrical character 
(j = 1, 2, ..., P) corresponding to the i-th variety of the 
experiment (i = 1, 2, ..., n). Principal component analysis 
is carried out on the $x_{ij}$'s. 

Let $PC_1, PC_2, \ldots, PC_K$ be the first K (K < P) principal 
components, together explaining about 90 per cent of 
the total variation in the $x_{ij}$'s. Using these K principal 
components as regressor variables and variety yield ($y_i$) 
as regressand, the following linear multiple regression 
model for pre-harvest forecast of crop yield has been 
proposed: 

$$y_i = \beta_0 + \beta_1 PC_{1i} + \beta_2 PC_{2i} + \cdots + \beta_K PC_{Ki} + e_i, \qquad i = 1, 2, \ldots, n$$

where $y_i$ is the crop yield of the i-th variety; $\beta_0, \beta_1, \beta_2, \ldots, \beta_K$ 
are model parameters and $e_i$ is the error term, assumed 
to be independently and normally distributed with mean 0 
and variance $\sigma^2$. 


The aforesaid model is fitted to the data by the ordinary 
least squares technique. 

Table A : Details of the experiments 

| Sr. No. | Experiment | Design | Treatment | Replication | Plot size | Date of sowing |
|---|---|---|---|---|---|---|
| 1. | Experiment-I | Simple lattice design | 25 varieties | 2 | 4.5 x 3.0 m | 25th Nov. 2010 (normal sowing period) |
| 2. | Experiment-II | Simple lattice design | 25 varieties | 2 | 3.5 x 2.5 m | 25th Dec. 2010 (late sowing period) |
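The fitting procedure described above (principal components of the biometrical characters, followed by ordinary least squares on the first K components) can be sketched roughly as follows. This is only an illustrative sketch: the data are synthetic, the function names `pca_scores` and `fit_pc_regression` are hypothetical, and the use of the correlation matrix is an assumption (it is suggested by the eigenvalues in Table 1 summing to 7, the number of characters).

```python
import numpy as np

def pca_scores(X):
    """Eigenvalues (descending) and PC scores from the correlation matrix of X.

    X has one row per variety and one column per biometrical character.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardise each character
    R = np.corrcoef(X, rowvar=False)                  # P x P correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)              # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]                 # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, Z @ eigvecs                       # scores: n x P

def fit_pc_regression(scores, y, K):
    """OLS of y on an intercept and the first K principal component scores."""
    design = np.column_stack([np.ones(len(y)), scores[:, :K]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ beta
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return beta, 1.0 - ss_res / ss_tot

# Synthetic stand-in for the experimental data (NOT the values from the paper).
rng = np.random.default_rng(0)
n, P, K = 22, 7, 5
X = rng.normal(size=(n, P))
X[:, 1] += 0.8 * X[:, 0]                              # make two characters correlated
y = 35.0 + X[:, 0] - 0.5 * X[:, 3] + rng.normal(scale=0.5, size=n)

eigvals, scores = pca_scores(X)
beta, r2 = fit_pc_regression(scores, y, K)
print("eigenvalues:", np.round(eigvals, 3))
print("coefficients (intercept first):", np.round(beta, 3))
print("R^2:", round(float(r2), 3))
```

Because the scores are centred and mutually uncorrelated, the estimated intercept equals the mean yield and each slope can be estimated independently of the others, which is the practical advantage over regressing on the correlated characters directly.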

Measures for validation of the model : 

Different measures for the validation of the model 
are given below. 

Co-efficient of determination ($R^2$) : 

The co-efficient of determination, $R^2$, is computed 
as 

$$R^2 = 1 - \frac{SS_{res}}{SS_t}$$

where $SS_{res}$ and $SS_t$ are the residual and total sums of 
squares in the analysis of variance of regression, 
respectively. 
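As a small illustration of the definition above, the co-efficient of determination can be computed from actual and fitted values as in the following sketch. The function name `r_squared` and the four data points are made up for illustration and are not taken from the experiments.

```python
def r_squared(actual, fitted):
    """R^2 = 1 - SS_res / SS_t for paired actual and fitted values."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))  # residual sum of squares
    ss_tot = sum((a - mean_y) ** 2 for a in actual)             # total sum of squares
    return 1.0 - ss_res / ss_tot

# Illustrative values only (hypothetical, not from the paper).
actual = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(actual, fitted)
print(round(r2, 4))  # SS_res = 0.10, SS_t = 5.0, so R^2 = 0.98
```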

Per cent deviation of forecast yield from actual yield : 

The per cent deviation of forecast yield from actual 
yield has been computed as 

$$\text{Per cent deviation of forecast yield} = \frac{\text{Actual yield} - \text{Forecast yield}}{\text{Actual yield}} \times 100$$


Root mean square error (RMSE) : 

It is also a measure for validating and comparing 
two models. The formula of RMSE is given by 

$$RMSE = \left[ \frac{1}{n} \sum_{i=1}^{n} (O_i - E_i)^2 \right]^{1/2}$$

where $O_i$ and $E_i$ are the observed and forecast 
values of the crop yield, respectively, and n is the number 
of observations (here, varieties) for which forecasting has been done. 
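The two measures above can be checked with a few lines of code. The sketch below applies them to the actual and forecast yields of experiment I as listed in Table 3; the function names are hypothetical. Note that the formula gives signed deviations, whereas Table 3 appears to report magnitudes only.

```python
import math

def percent_deviation(actual, forecast):
    """Signed per cent deviation of the forecast from the actual yield."""
    return (actual - forecast) / actual * 100.0

def rmse(observed, forecast):
    """Root mean square error over paired observed and forecast values."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, forecast)) / n)

# Actual and forecast yields (q/ha) of the three validation varieties, experiment I (Table 3).
actual_I = [37.50, 35.00, 35.70]
forecast_I = [36.10, 35.50, 34.46]

deviations = [percent_deviation(a, f) for a, f in zip(actual_I, forecast_I)]
rmse_I = rmse(actual_I, forecast_I)
print([round(d, 2) for d in deviations])
print(round(rmse_I, 2))  # about 1.12 q/ha, close to the 1.11 reported in Table 3
```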


Per cent standard error of the forecast : 

Let $y_f$ be the forecast value of crop yield and $x_0$ be the 
column vector of values of the P independent variables at 
which y is forecast. Then the variance of $y_f$ is given by 
(Draper and Smith, 1998) 

$$V(y_f) = \hat{\sigma}^2 \, x_0' (X'X)^{-1} x_0$$

where $X'X$ is the matrix of sums of squares 
and cross products of the regressor matrix X (independent 
variables) and $\hat{\sigma}^2$ is the estimated residual variance of 
the model. Therefore, the per cent standard error (CV) 
of the forecast is given by 

$$\text{Per cent S.E.} = \frac{\sqrt{V(y_f)}}{\text{Forecast value}} \times 100$$
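The per cent standard error of a forecast can be computed along the following lines. This is a sketch under stated assumptions: the data are synthetic, the function name `percent_se_of_forecast` is hypothetical, the regressor matrix X is assumed to carry a leading column of ones for the intercept (so $x_0$ also starts with 1), and the residual variance is estimated with n minus the number of fitted parameters as divisor.

```python
import numpy as np

def percent_se_of_forecast(X, y, x0):
    """Forecast at x0 and its per cent standard error, 100 * sqrt(V(y_f)) / y_f."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)          # OLS estimates
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)                      # estimated residual variance
    var_f = sigma2 * x0 @ np.linalg.solve(X.T @ X, x0)    # sigma^2 * x0' (X'X)^-1 x0
    y_f = x0 @ beta
    return y_f, 100.0 * np.sqrt(var_f) / y_f

# Synthetic stand-in: 22 varieties, intercept plus five component scores (assumed layout).
rng = np.random.default_rng(1)
n = 22
pcs = rng.normal(size=(n, 5))
X = np.column_stack([np.ones(n), pcs])
y = 35.0 + pcs @ np.array([0.5, -0.1, -0.9, 1.8, -1.2]) + rng.normal(scale=1.0, size=n)

x0 = np.concatenate([[1.0], rng.normal(size=5)])          # scores of a held-out variety
y_f, pse = percent_se_of_forecast(X, y, x0)
print(round(float(y_f), 2), round(float(pse), 2))
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of $X'X$ explicitly, which is numerically preferable but gives the same quantity as the formula above.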


Results and Data analysis : 

Using the data on the biometrical characters $X_1$, $X_2$, $X_3$, 
$X_4$, $X_5$, $X_6$ and $X_7$, principal component analysis has 
been carried out for the data of both experiments. The 
results of the principal component analysis for experiments 
I and II are given in Table 1. Since the first five principal 
components explained about 93 per cent of the total 
variability (Table 1), these first five principal components 
have been used as regressor variables and variety yield 
as regressand in the development of the model. The model 
has been fitted to the data of experiments I and II 
using the yields of the first 22 varieties by applying the ordinary least 
squares technique. The yields of the last three varieties were 
left for validation of the model. The fitted models, along 
with the values of $R^2$, are presented in Table 2. 
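The choice of five components can be reproduced from the eigenvalues reported in Table 1. The sketch below assumes a selection rule of "the smallest K whose cumulative variability reaches 90 per cent", which is one reading of the "about 90 per cent" criterion stated in the methods; the function name `components_needed` is hypothetical.

```python
import numpy as np

def components_needed(eigenvalues, threshold=90.0):
    """Smallest K whose cumulative variability (%) reaches the threshold."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    cumulative = 100.0 * np.cumsum(eigenvalues) / eigenvalues.sum()
    k = int(np.searchsorted(cumulative, threshold) + 1)   # first index with cumulative >= threshold
    return k, cumulative

# Eigenvalues as reported in Table 1.
eig_I = [2.107, 1.760, 1.182, 0.886, 0.584, 0.274, 0.208]
eig_II = [2.484, 1.413, 1.137, 0.836, 0.601, 0.369, 0.160]

k_I, cum_I = components_needed(eig_I)
k_II, cum_II = components_needed(eig_II)
print(k_I, np.round(cum_I, 2))
print(k_II, np.round(cum_II, 2))
```

With these eigenvalues the rule selects five components in both experiments, since four components explain only about 84 to 85 per cent of the variability.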

The forecasts of wheat yield for the remaining three 
varieties of wheat experiments I and II were 
computed by applying the forecast models given in 
Table 2. The per cent deviation of forecast, RMSE and 
per cent standard error (CV) of each forecast yield for 
both experiments were computed and are presented 
in Table 3, along with the actual and forecast yields. 

For experiment I, it can be observed from Table 
2 that the first principal component ($PC_1$) and the fourth 


Table 1 : Principal component analysis 

| | F1 | F2 | F3 | F4 | F5 | F6 | F7 |
|---|---|---|---|---|---|---|---|
| Experiment-I | | | | | | | |
| Eigenvalue | 2.107 | 1.760 | 1.182 | 0.886 | 0.584 | 0.274 | 0.208 |
| Variability (%) | 30.094 | 25.139 | 16.893 | 12.654 | 8.345 | 3.907 | 2.968 |
| Cumulative % | 30.094 | 55.233 | 72.126 | 84.780 | 93.125 | 97.032 | 100.000 |
| Experiment-II | | | | | | | |
| Eigenvalue | 2.484 | 1.413 | 1.137 | 0.836 | 0.601 | 0.369 | 0.160 |
| Variability (%) | 35.487 | 20.189 | 16.240 | 11.940 | 8.586 | 5.275 | 2.283 |
| Cumulative % | 35.487 | 55.677 | 71.917 | 83.857 | 92.443 | 97.717 | 100.000 |

NB : The Fj's are factors (explanatory variables) 

Table 2 : Forecast model for wheat based on experiment I and II 

| Experiment | Forecast model | R² (%) |
|---|---|---|
| I | Yield = 35.747 (0.462) + 0.464** PC1 (0.309) - 0.086 PC2 (0.334) - 0.884 PC3 (0.461) + 1.812** PC4 (0.505) - 1.173* PC5 (0.570) | 63.80* |
| II | Yield = 32.142 (0.444) + 0.962* PC1 (0.271) - 0.645 PC2 (0.375) + 0.136 PC3 (0.391) + 1.395* PC4 (0.463) + 0.332 PC5 (0.569) | 62.10* |

Note: Figures in brackets denote the standard error of the preceding regression co-efficient. * and ** indicate significance of values at P<0.05 and P<0.01, respectively. 


Table 3 : Actual and forecast yield of wheat based on wheat experiment I and II 

| Experiment | Actual yield (q/ha) | Forecast yield (q/ha) | RMSE | PSE (CV) |
|---|---|---|---|---|
| I | 37.50 | 36.10 (3.37) | 1.11 | 4.96 |
| | 35.00 | 35.50 (1.43) | | 3.10 |
| | 35.70 | 34.46 (3.47) | | 3.80 |
| II | 30.00 | 31.43 (4.70) | 0.96 | 3.46 |
| | 30.50 | 31.05 (1.80) | | 2.81 |
| | 32.50 | 33.16 (2.03) | | 2.16 |

Note: Figures in brackets denote % deviation of forecast. PSE: per cent standard error; CV: co-efficient of variation. 


principal component ($PC_4$) showed a positive and significant 
effect on the yield, whereas the fifth principal component 
($PC_5$) showed a negative and significant effect on wheat 
yield. However, these components do not carry any physical meaning 
regarding the relationship between y and the x's (biometrical 
characters). The value of $R^2$ was found to be 63.80 per cent, 
which is reasonably appropriate. For experiment 
II, the first principal component ($PC_1$) and the fourth principal 
component ($PC_4$) showed a positive and significant effect on 
the yield of wheat. The value of $R^2$ was found to be 62.10 
per cent (Table 2). 

A perusal of Table 3 reveals that the proposed 
models, based on biometrical characters and fitted by applying 
principal component analysis, have provided forecast yields 
very close to the actual yields of wheat in both 
situations of normal and late sowing. The per 
cent standard errors of the forecast yields in both 
situations were found to be within the reasonable 
range of 2.16 to 4.96 per cent. The RMSE was 
also low, at 1.11 and 0.96 for experiments I and II, 
respectively (Table 3). Thus, on the 
basis of the overall results of Tables 2 and 3, it can be 
concluded that the application of the technique of principal 
component analysis has provided a suitable forecast 
model using biometrical characters. Therefore, the 
proposed model can be used to obtain reliable pre-harvest 
forecasts of wheat yield in both situations, provided that proper 
measurements of the biometrical characters under 
consideration are available. 